HW-ADAM: FPGA-Based Accelerator for Adaptive Moment Estimation
Authors
Abstract
The selection of the optimizer is critical for convergence in the field of on-chip training. As a second-moment optimizer, adaptive moment estimation (ADAM) shows a significant advantage over non-moment optimizers such as stochastic gradient descent (SGD) and first-moment optimizers such as Momentum. However, ADAM is hard to implement in hardware because of its computationally intensive operations, including squaring, root extraction, and division. This work proposes Hardware-ADAM (HW-ADAM), an efficient fixed-point accelerator that highlights hardware-oriented mathematical optimizations. HW-ADAM has two designs. The Efficient-ADAM (E-ADAM) unit reduces resource consumption by around 90% compared with related work and achieves a throughput of 2.89 MUOP/s (million updating operations per second), which is 2.8× that of the original ADAM. The Fast-ADAM (F-ADAM) unit reduces flip-flops by 91.5%, look-up tables by 65.7%, and DSPs by 50% compared with related work; F-ADAM achieves a throughput of 16.7 MUOP/s, which is 16.4× that of the original ADAM.
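To make concrete where the square, root extraction, and division mentioned in the abstract arise, the following is a minimal floating-point sketch of one textbook ADAM update for a single scalar parameter. It is not the fixed-point HW-ADAM design described in the paper; the function name `adam_step` and the default hyperparameters (`lr`, `beta1`, `beta2`, `eps`) are assumptions chosen for illustration.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One textbook ADAM update for a scalar parameter (illustrative only).

    theta: current parameter value
    grad:  gradient of the loss with respect to theta
    m, v:  running first- and second-moment estimates
    t:     1-based step counter, used for bias correction
    """
    # First moment: exponential moving average of the gradient.
    m = beta1 * m + (1.0 - beta1) * grad
    # Second moment: moving average of the squared gradient (the "square").
    v = beta2 * v + (1.0 - beta2) * grad * grad
    # Bias correction of both moments (two divisions).
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    # Parameter update: root extraction followed by another division.
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Usage: a single step from zero-initialised moments.
theta, m, v = adam_step(theta=1.0, grad=0.5, m=0.0, v=0.0, t=1)
```

On the first step the bias-corrected moments reduce to the gradient and its square, so the parameter moves by roughly `lr` in the direction opposite to the gradient's sign. Each update needs one square root and three divisions per parameter, which is the per-update cost a hardware design has to reduce.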
Similar resources
FPGA based accelerator for parallel DBSCAN algorithm
Data mining is playing a vital role in various application fields. One important issue in data mining is clustering, which is a process of grouping data with high similarity. Density-based clustering is an effective method that can find clusters in arbitrary shapes in feature space, and DBSCAN (Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise) is a basic on...
Packet Filtering for FPGA-Based Routing Accelerator
In this paper, we present a novel approach for Binary Decision Diagram based semantically extended representation of packet filters called Filter Decision Diagrams (FDD), used for efficient filter processing and lookup in a hardware accelerator that uses a lookup engine employing CAM and comparison instructions kept in SRAM. We present the most important operations for FDDs and also give some c...
HW/SW Co-design for FPGA based Video Processing Platform
In this paper we present a Video Processing Platform (VPP) for rapid prototyping based on FPGA (Field Programmable Gate Arrays) architecture using EDK embedded system and Xilinx System Generator. This hardware/software co-design platform has been implemented on a Xilinx Spartan 3A DSP FPGA. The video interface blocks are done in RTL and the MicroBlaze soft processor is used as an embedded video...
Fully Parameterizable FPGA based Crypto-Accelerator
In this paper, RSA encryption algorithm and its hardware implementation in Xilinx’s Virtex Field Programmable Gate Arrays (FPGA) is analyzed. The issues of scalability, flexible performance, and silicon efficiency for the hardware acceleration of public key crypto systems are being explored in the present work. Using techniques based on the interleaved math for exponentiation, the proposed RSA ...
HW/SW Codesign of FPGA-based Neural Networks
In this article, we present a HW/SW codesign approach for the implementation of multilayer perceptrons resulting in an embedded system that can be used in wide variety of applications. The motivation for the HW/SW codesign includes declining time-to-market and power constraints as well as increasing gap between silicon area and computational intensity. By utilizing codesign, hardware tasks −to ...
Journal
Journal title: Electronics
Year: 2023
ISSN: 2079-9292
DOI: https://doi.org/10.3390/electronics12020263